Euler simulation of interacting particle systems and McKean-Vlasov SDEs with fully superlinear growth drifts in space and interaction
We consider in this work the convergence of a split-step Euler type scheme
(SSM) for the numerical simulation of interacting particle Stochastic
Differential Equation (SDE) systems and McKean-Vlasov Stochastic Differential
Equations (MV-SDEs) with full super-linear growth in the spatial and the
interaction component in the drift, and non-constant Lipschitz diffusion
coefficient.
The super-linear growth in the interaction (or measure) component stems from
convolution operations with super-linear growth functions allowing in
particular application to the granular media equation with multi-well confining
potentials. From a methodological point of view, we avoid altogether functional
inequality arguments (as we allow for non-constant non-bounded diffusion maps).
The scheme attains, in stepsize, a near-optimal classical (path-space) root
mean-square error rate of $1/2-\varepsilon$ for $\varepsilon>0$ and an optimal
rate $1/2$ in the non-path-space mean-square error metric. Numerical examples
illustrate all findings. In particular, the testing raises doubts if taming is
a suitable methodology for this type of problem (with convolution terms and
non-constant diffusion coefficients).
Comment: 40 pages, 3 figures; Final author accepted version (to appear in IMA
J. of Num. Analysis)
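The split-step idea in the abstract above can be illustrated on a much simpler problem. The following is a minimal sketch, not the authors' scheme: it drops the interaction (convolution) term entirely and assumes a scalar SDE with the hypothetical drift b(y) = -y**3 and the hypothetical Lipschitz diffusion sigma(y) = sigma * y. Each step first solves the implicit drift equation y = x + h * b(y) by Newton's method, then adds the diffusion increment evaluated at y. The function names are illustrative only.

```python
import math
import random

def implicit_drift_step(x, h, tol=1e-12, max_iter=50):
    """Solve y + h*y**3 = x for y (implicit Euler step for the assumed drift -y**3)."""
    y = x  # starting at x, Newton converges monotonically for this convex/concave map
    for _ in range(max_iter):
        g = y + h * y ** 3 - x
        if abs(g) <= tol * max(1.0, abs(x)):
            break
        y -= g / (1.0 + 3.0 * h * y * y)  # derivative is always >= 1, so no division issue
    return y

def split_step_path(x0, h, n_steps, sigma, rng):
    """Simulate one path: implicit drift step, then explicit diffusion step."""
    x = x0
    for _ in range(n_steps):
        y = implicit_drift_step(x, h)
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over one step
        x = y + sigma * y * dw             # assumed diffusion coefficient sigma * y
    return x

if __name__ == "__main__":
    rng = random.Random(0)
    print(split_step_path(x0=5.0, h=0.01, n_steps=1000, sigma=0.5, rng=rng))
```

The implicit step never increases the magnitude of the state, because y + h*y**3 = x forces |y| <= |x|. A plain explicit Euler step x - h*x**3 has no such guarantee for large |x|, which is the usual motivation for implicit or tamed treatments of super-linear drifts.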
An iterative method for Helmholtz boundary value problems arising in wave propagation
The complex Helmholtz equation $(\Delta + k^2)u = f$ (where $k\in\mathbb{R}$, $u(\cdot), f(\cdot)\in\mathbb{C}$) is a mainstay of computational wave
simulation. Despite its apparent simplicity, efficient numerical methods are
challenging to design and, in some applications, regarded as an open problem.
Two sources of difficulty are the large number of degrees of freedom and the
indefiniteness of the matrices arising after discretisation. Seeking to meet
them within the novel framework of probabilistic domain decomposition, we set
out to rewrite the Helmholtz equation into a form amenable to the Feynman-Kac
formula for elliptic boundary value problems. We consider two typical
scenarios, the scattering of a plane wave and the propagation inside a cavity,
and recast them as a sequence of Poisson equations. By means of stochastic
arguments, we find a sufficient and simulatable condition for the convergence
of the iterations. Upon discretisation a necessary condition for convergence
can be derived by adding up the iterates using the harmonic series for the
matrix inverse -- we illustrate the procedure in the case of finite
differences.
From a practical point of view, our results are ultimately of limited scope.
Nonetheless, this unexpected -- even paradoxical -- new direction of attack on
the Helmholtz equation proposed by this work offers a fresh perspective on this
classical and difficult problem. Our results show that there indeed exists a
predictable range $k<k_{max}$ in which this new ansatz works with $k_{max}$
being far below the challenging situation.
Comment: 21 pages, 6 Figures, 1 table
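To make the "sequence of Poisson equations" concrete, here is a minimal sketch under strong simplifying assumptions: one space dimension, real-valued data, homogeneous Dirichlet boundary conditions, and the simple fixed-point form u_{n+1}'' = f - k^2 u_n starting from u_0 = 0. This iteration form is an assumption chosen for illustration and may differ from the one used in the paper. Each Poisson solve uses second-order finite differences and the Thomas algorithm; all function names are hypothetical.

```python
import math

def solve_poisson_1d(rhs, h):
    """Solve u'' = rhs on a uniform grid with u = 0 at both ends (Thomas algorithm)."""
    n = len(rhs)
    diag = [-2.0] * n                 # main diagonal of h^2 times the second-difference matrix
    d = [h * h * r for r in rhs]      # right-hand side scaled by h^2
    for i in range(1, n):             # forward elimination (off-diagonal entries are all 1)
        w = 1.0 / diag[i - 1]
        diag[i] -= w
        d[i] -= w * d[i - 1]
    u = [0.0] * n
    u[-1] = d[-1] / diag[-1]
    for i in range(n - 2, -1, -1):    # back substitution
        u[i] = (d[i] - u[i + 1]) / diag[i]
    return u

def helmholtz_by_poisson_iteration(k, f, h, n_iter):
    """Iterate u_{n+1}'' = f - k^2 u_n, i.e. one Poisson solve per step."""
    u = [0.0] * len(f)
    for _ in range(n_iter):
        rhs = [fi - k * k * ui for fi, ui in zip(f, u)]
        u = solve_poisson_1d(rhs, h)
    return u

def max_error(k, n=50, n_iter=60):
    """Error against the manufactured solution u(x) = sin(pi x) on (0, 1)."""
    h = 1.0 / (n + 1)
    x = [(i + 1) * h for i in range(n)]
    exact = [math.sin(math.pi * xi) for xi in x]
    f = [(k * k - math.pi ** 2) * e for e in exact]  # f = u'' + k^2 u
    u = helmholtz_by_poisson_iteration(k, f, h, n_iter)
    return max(abs(ui - ei) for ui, ei in zip(u, exact))

if __name__ == "__main__":
    print("k = 2:", max_error(2.0))
    print("k = 4:", max_error(4.0))
```

In this toy setting the iterates form a power series in k^2 times the inverse discrete Laplacian, so they converge only when k^2 is below the smallest eigenvalue of the negative discrete Laplacian, which is close to pi^2 on the unit interval. The run with k = 2 therefore settles near the exact solution, while k = 4 grows without bound, a small-scale analogue of the limited range of k described in the abstract.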
Wellposedness, exponential ergodicity and numerical approximation of fully super-linear McKean--Vlasov SDEs and associated particle systems
We study a class of McKean--Vlasov Stochastic Differential Equations
(MV-SDEs) with drifts and diffusions having super-linear growth in measure and
space -- the maps have general polynomial form but also satisfy a certain
monotonicity condition. The combination of the drift's super-linear growth in
measure (by way of a convolution) and the super-linear growth in space and
measure of the diffusion coefficient require novel technical elements in order
to obtain the main results. We establish wellposedness, propagation of chaos
(PoC), and under further assumptions on the model parameters we show an
exponential ergodicity property alongside the existence of an invariant
distribution. No differentiability or non-degeneracy conditions are required.
Further, we present a particle system based Euler-type split-step scheme
(SSM) for the simulation of this type of MV-SDEs. The scheme attains, in
stepsize, the strong error rate $1/2$ in the non-path-space root-mean-square
error metric and we demonstrate the property of mean-square contraction. Our
results are illustrated by numerical examples including: estimation of PoC
rates across dimensions, preservation of periodic phase-space, and the
observation that taming does not appear to be a suitable method unless strong
dissipativity is present.
Comment: 34 pages, 5 figures
Whisper-MCE: Whisper Model Finetuned for Better Performance with Mixed Languages
Recently Whisper has approached human-level robustness and accuracy in
English automatic speech recognition (ASR), while in minor language and mixed
language speech recognition, there remains a compelling need for further
improvement. In this work, we present the impressive results of Whisper-MCE,
our finetuned Whisper model, which was trained using our self-collected
dataset, Mixed Cantonese and English audio dataset (MCE). Meanwhile,
considering word error rate (WER) poses challenges when it comes to evaluating
its effectiveness in minor language and mixed-language contexts, we present a
novel rating mechanism. By comparing our model to the baseline whisper-large-v2
model, we demonstrate its superior ability to accurately capture the content of
the original audio, achieve higher recognition accuracy, and exhibit faster
recognition speed. Notably, our model outperforms other existing models in the
specific task of recognizing mixed language.
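For reference, the standard word error rate mentioned in the abstract above is the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The sketch below is a plain textbook implementation with whitespace tokenisation, which is an assumption made here; it is not the authors' evaluation code and it does not implement their proposed rating mechanism.

```python
def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the processed reference prefix and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, w in enumerate(hyp, start=1):
            cost = 0 if r == w else 1
            curr[j] = min(prev[j] + 1,         # deletion of a reference word
                          curr[j - 1] + 1,     # insertion of a hypothesis word
                          prev[j - 1] + cost)  # match or substitution
        prev = curr
    return prev[-1] / len(ref)

if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
```

Whitespace tokenisation works for English but not for text written without spaces between words, so a mixed Cantonese and English transcript needs an explicit segmentation choice before a score like this can be computed. That is one commonly cited difficulty with WER in such settings, though not necessarily the one the authors have in mind.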
A Hybrid Multiscale Framework for Subsurface Flow and Transport Simulations
Extensive research is aimed at improving predictive ability of biogeochemical earth and environmental system simulators, with applications ranging from contaminant transport and remediation to impacts of carbon and nitrogen cycling on local ecosystems and climate. Most process-based numerical models are designed for a single characteristic length and time scale. For application-relevant scales, it is necessary to introduce approximations and empirical parameterizations to describe complex systems because of limitations on process understanding, system characterization and computation. Using emerging understanding of biological and environmental processes at fundamental scales to advance predictions of the larger system behavior requires the development of multiscale simulators, and there is strong interest in coupling microscale and macroscale models together in a hybrid multiscale simulation. A limited number of hybrid multiscale simulations have been developed for biogeochemical systems, mostly using application-specific approaches for model coupling. We are developing a generalized approach to hierarchical model coupling designed for high-performance computational systems, based on the Swift computing workflow framework. In this presentation we will describe the generalized approach and provide two use cases: 1) simulation of a mixing-controlled biogeochemical reaction coupling pore- and continuum-scale models, and 2) simulation of biogeochemical impacts of groundwater–river water interactions coupling fine- and coarse-grid model representations. This generalized framework can be customized for use with any pair of linked models (microscale and macroscale) with minimal intrusiveness to the at-scale simulators. It combines a set of Python scripts with the Swift workflow environment to execute a complex multiscale simulation utilizing an approach similar to the well-known Heterogeneous Multiscale Method.
User customization is facilitated through user-provided input and output file templates and processing function scripts, and execution within a high-performance computing environment is handled by Swift, such that minimal to no user modification of at-scale codes is required.
Dataless text classification with descriptive LDA
Manually labeling documents for training a text classifier is expensive and time-consuming. Moreover, a classifier trained on labeled documents may suffer from overfitting and adaptability problems. Dataless text classification (DLTC) has been proposed as a solution to these problems, since it does not require labeled documents. Previous research in DLTC has used explicit semantic analysis of Wikipedia content to measure semantic distance between documents, which is in turn used to classify test documents based on nearest neighbours. The semantic-based DLTC method has a major drawback in that it relies on a large-scale, finely-compiled semantic knowledge base, which is difficult to obtain in many scenarios. In this paper we propose a novel kind of model, descriptive LDA (DescLDA), which performs DLTC with only category description words and unlabeled documents. In DescLDA, the LDA model is assembled with a describing device to infer Dirichlet priors from prior descriptive documents created with category description words. The Dirichlet priors are then used by LDA to induce category-aware latent topics from unlabeled documents. Experimental results with the 20Newsgroups and RCV1 datasets show that: (1) our DLTC method is more effective than the semantic-based DLTC baseline method; and (2) the accuracy of our DLTC method is very close to state-of-the-art supervised text classification methods. As neither external knowledge resources nor labeled documents are required, our DLTC method is applicable to a wider range of scenarios.